AITopics | double descent

double descent

A Rigorous, Tractable Measure of Model Complexity

Allerbo, Oskar, Schön, Thomas B.

arXiv.org Machine Learning, May-21-2026

One of the most fundamental properties of a machine learning model is its complexity, with applications across topics such as interpretation, generalization, and model selection. Despite its importance, there is no canonical, model-agnostic way to assess a model's complexity. While simple heuristics, such as the number or magnitude of parameters, yield very crude estimates, hyperparameter-based approaches, such as polynomial degree or kernel length scale, do not generalize across model classes. More rigorous methods, including the Vapnik-Chervonenkis dimension (VCD) (Vapnik, 2013), Rademacher complexity (RMC) (Bartlett and Mendelson, 2002), and effective number of parameters (or effective degrees of freedom, ENP) (Efron, 1986), are difficult, or even impossible, to compute in practice, leaving the user to resort to crude bounds and/or approximations. The topic is further complicated by the often overlooked distinction between model and function complexity, where the former sets a ceiling on the latter.
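For one narrow case, the effective number of parameters mentioned above is easy to compute: for ridge regression it equals the trace of the hat matrix, which reduces to a sum over singular values of the design matrix. The sketch below illustrates only that textbook special case, not the measure proposed in the paper; the function name `ridge_effective_dof` and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def ridge_effective_dof(X, lam):
    """Effective number of parameters of ridge regression with penalty lam.

    Equals trace(X (X^T X + lam I)^{-1} X^T), computed from the singular
    values s_i of X as sum_i s_i^2 / (s_i^2 + lam).
    Assumes X has full column rank when lam == 0.
    """
    s = np.linalg.svd(X, compute_uv=False)
    return float(np.sum(s**2 / (s**2 + lam)))


if __name__ == "__main__":
    # Hypothetical synthetic design matrix: 50 samples, 10 features.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 10))
    for lam in (0.0, 1.0, 100.0):
        print(f"lambda={lam:7.1f}  effective parameters={ridge_effective_dof(X, lam):.3f}")
```

With no penalty the value equals the number of features, and it shrinks toward zero as the penalty grows, which is why the raw parameter count is only a crude proxy for complexity.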

artificial intelligence, complexity, machine learning, (18 more...)

arXiv.org Machine Learning

2605.21167

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

On the Double Descent of Random Features Models Trained with SGD

Neural Information Processing Systems, Apr-28-2026, 04:24:45 GMT

We study generalization properties of random features (RF) regression in high dimensions optimized by stochastic gradient descent (SGD) in the under- and overparameterized regimes. In this work, we derive precise non-asymptotic error bounds for RF regression under both constant and polynomial-decay step-size SGD settings, and observe the double descent phenomenon both theoretically and empirically. Our analysis shows how to cope with multiple sources of randomness (initialization, label noise, and data sampling, as well as stochastic gradients) with no closed-form solution, and also goes beyond the commonly used Gaussian/spherical data assumption. Our theoretical results demonstrate that, with SGD training, RF regression still generalizes well for interpolation learning, and that the double descent behavior can be characterized by the unimodality of variance and the monotonic decrease of bias. Besides, we also prove that the constant step-size SGD setting incurs no loss in convergence rate when compared to the exact minimum-norm interpolator, as a theoretical justification of using SGD in practice.
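The double descent curve described in this abstract can be reproduced in a small simulation. The following is a minimal sketch under stated assumptions, not the paper's setup: it uses ReLU random features, a noisy linear target, and the closed-form minimum-norm least-squares solution (via `numpy.linalg.lstsq`) in place of SGD. The function names, dimensions, and noise level are all hypothetical choices made for illustration.

```python
import numpy as np


def rf_test_error(m, n=100, d=20, n_test=500, noise=0.5, seed=0):
    """Test MSE of random features regression with m ReLU features.

    Fits the minimum-norm least-squares solution (not SGD) on n noisy
    samples of a linear target and evaluates on noiseless test targets.
    """
    rng = np.random.default_rng(seed)
    beta = rng.normal(size=d) / np.sqrt(d)        # hypothetical ground truth
    X = rng.normal(size=(n, d))
    X_te = rng.normal(size=(n_test, d))
    y = X @ beta + noise * rng.normal(size=n)     # noisy training labels
    y_te = X_te @ beta
    W = rng.normal(size=(d, m)) / np.sqrt(d)      # fixed random first layer
    Phi = np.maximum(X @ W, 0.0)                  # ReLU random features
    Phi_te = np.maximum(X_te @ W, 0.0)
    # lstsq returns the minimum-norm solution when the system is underdetermined.
    theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return float(np.mean((Phi_te @ theta - y_te) ** 2))


def sweep(widths, trials=5):
    """Average test error over several seeds for each feature count."""
    return {
        m: float(np.mean([rf_test_error(m, seed=s) for s in range(trials)]))
        for m in widths
    }


if __name__ == "__main__":
    for m, err in sweep([10, 50, 100, 200, 1000]).items():
        print(f"features={m:5d}  test MSE={err:.3f}")
```

In runs of this kind the test error typically peaks near the interpolation threshold, where the number of features equals the number of training samples (here 100), and falls again as the model becomes heavily overparameterized.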

artificial intelligence, machine learning, neural information processing system, (15 more...)

Neural Information Processing Systems

Country: Europe (0.28)

Genre: Research Report > New Finding (0.89)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.76)

Deep Learning Through A Telescoping Lens: A Simple Model Provides Empirical Insights On Grokking, Gradient Boosting & Beyond

Alan Jeffares

Neural Information Processing Systems, Feb-18-2026, 10:02:23 GMT

Deep learning sometimes appears to work in unexpected ways.

artificial intelligence, machine learning, neural network, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Spain > Andalusia > Granada Province > Granada (0.04)

Genre: Research Report > Experimental Study (0.92)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

aec5e2847c5ae90f939ab786774856cc-Paper-Conference.pdf

Neural Information Processing Systems, Feb-16-2026, 13:37:43 GMT

artificial intelligence, inductive learning, machine learning, (18 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Massachusetts > Middlesex County > Reading (0.04)
North America > Mexico > Yucatán > Mérida (0.04)
Asia > Pakistan (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.93)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)

e271e30de7a2e462ca1f85cefa816380-Supplemental-Conference.pdf

Neural Information Processing Systems, Feb-12-2026, 10:58:07 GMT

Most of this work was done when Fanghui was at KULeuven.

artificial intelligence, machine learning, xt 1, (18 more...)

Neural Information Processing Systems

Country: Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

e271e30de7a2e462ca1f85cefa816380-Paper-Conference.pdf

Neural Information Processing Systems, Feb-12-2026, 10:58:04 GMT

descent, neural information processing system, neural network, (13 more...)

Neural Information Processing Systems

Country:

North America > United States (0.14)
Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.56)

f754186469a933256d7d64095e963594-Paper.pdf

Neural Information Processing Systems, Feb-11-2026, 23:26:42 GMT

eigenvalue, model size, neural network, (13 more...)

Neural Information Processing Systems

Country:

North America > United States (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
Asia > Middle East > Israel (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Model, sample, and epoch-wise descents: exact solution of gradient flow in the random feature model

Neural Information Processing Systems, Feb-10-2026, 20:00:29 GMT

This important phenomenon commonly appears in implemented neural network architectures, and also seems to emerge in epoch-wise curves during the training process.

artificial intelligence, descent, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Least Squares Regression Can Exhibit Under-Parameterized Double Descent

Neural Information Processing Systems, Feb-10-2026, 08:59:30 GMT

This paper demonstrates interesting new phenomena that suggest that our understanding of the relationship between the number of data points, the number of parameters, and the generalization error is incomplete, even for simple linear models.

artificial intelligence, machine learning, trn, (17 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
